Abstract


We consider a supply chain, which consists of several retailers and one supplier. The 
retailers, which possibly differ in their cost and demand parameters, may be coordinated 
through replenishment strategies and transshipments, that is, movement of a product 
among the locations at the same echelon level. We prove that in order to minimize the 
expected long-run average cost for this system, an optimal replenishment policy is for each 
retailer to follow an order-up-to S policy. Furthermore, we demonstrate how the values of 
the order-up-to quantities can be calculated using a sample-path-based optimization 
procedure. Given an order-up-to S policy, we show how to determine an optimal 
transshipment policy, using an LP/network flow framework. Such a combined numerical
approach allows us to study complex and large systems.


Subject Classifications: Inventory/Production: multi-location transshipments;
Simulation: infinitesimal perturbation analysis (IPA)


* This research was performed, in part, while the author was at the Department of Industrial Engineering, Tel Aviv 
University. 
* Currently visiting the IE/MS department at Northwestern University. 


1. Introduction 


Physical pooling of inventories (Eppen 1979) has been widely used in practice to reduce cost and 
improve customer service. For example, Xerox has consolidated all of its country-based 
warehouses in Europe into a single European Logistics Center in the Netherlands. On the other 
hand, the practice of transshipment, the monitored movement of material between locations at 
the same echelon (e.g., among retailers), may entail the sharing of stock through enhanced 
visibility, but without the need to locate the stock physically in the same location. To emphasize 
the requirement for supply chain transparency at the same echelon, we will refer to this practice 
as information pooling. Such information pooling through transshipments has been less 
frequent. Transshipments provide an effective mechanism for correcting discrepancies between 
the locations’ observed demand and their available inventory. As a result, transshipments may 
lead to cost reductions and improved service without increasing system-wide inventories. In this 
paper, we study transshipments as an effective materials management policy. 

Consider the following examples. Suppose that you go shopping at Foot Locker in 
Hamburg, Germany. You find a pair of Avanti Leather shoes, but, to your disappointment, they 
do not have your size. Knowing that it would take at least a few weeks to get the shoes that you 
desire from the Italian manufacturer, you get ready to leave the store disappointed. However, a 
sales representative quickly determines, through a simple check on the store’s computer, that the 
Foot Locker in Antwerp, Belgium, has the shoes in your size. As she arranges to have the shoes 
sent overnight, she suggests that you come back the next day to try them on. 

FNAC is the leading retailer of cultural and leisure products in France. The company has 
recently opened an on-line channel, fnac.com, in addition to its vast network of retail shops. 
Upon the receipt of an order from the Internet, there are several options for order fulfillment: 
fnac.com’s own stock, stock kept at a central distribution center, and stock from nearby FNAC
stores. The last option represents one-way transshipments as physical inventory held at a store is 
used to satisfy the demand at fnac.com instead of ordering the item from the central distribution 
center or from its supplier. Although fulfilling customer demand through transshipments has a 
higher short-term operational cost, the supply chain manager of the company asserts that
exercising the transshipment option expands the portfolio of items they can offer through the
Internet threefold without having to carry the associated stock.


In manufacturing environments, the management of spare parts inventories is crucial. In 
electronics, for instance, while spare parts are expensive, the lack of a key spare part can halt
production, resulting in a significant loss of output and the associated revenue. Traditionally,
electronics manufacturers hold their own spare parts inventory and parts are not shared among 
manufacturers. Through information sharing and transshipments, however, these companies can 
expect to achieve significant inventory reductions and increased availability levels. ASML, a 
Dutch manufacturer of wafer steppers, reports that its Japanese clients (e.g., big chip 
manufacturers such as NEC, Toshiba, etc.) frequently transship spare parts among themselves 
before placing a new replenishment order with ASML. Such transshipments are also very 
common among airlines for the so-called maintenance-repair-operations (MRO) items. 

In the above examples, transshipments are sometimes used in a reactive mode (in response to 
an actual stockout). Alternatively, companies may realize that increased benefits can be 
achieved by proactively incorporating the transshipment option into the planning phase. Planned 
and systematic transshipments represent a relatively novel idea. They replace physical 
consolidation with virtual integration through information sharing. In this paper, we propose a 
model that allows the exploitation of the advantages of deploying transshipments in a proactive 
fashion. 

There are two key reasons why information pooling has not yet been widely adopted in 
practice: the inadequacy of the IT infrastructure and the lack of realistic models exploiting the 
benefits of this policy. While the past decade has seen significant investment in IT infrastructure 
(e.g., implementation of ERP systems and other web-based technologies) enabling transparency 
within supply chains, new business models of transshipments have not been developed as 
rapidly. The literature on transshipments has generally addressed either problems with two 
retailers, e.g., Tagaras (1989), Tagaras and Cohen (1992), Robinson (1990) and Herer and Tzur 
(2001) or problems with multiple, mainly identical retailers, e.g., Krishnan and Rao (1965) and 
Robinson (1990). Herer and Tzur (2003) considered non-identical multi-retailers in a 
deterministic setting. In contrast, we consider multiple retailers, who are allowed to differ from 
each other both in their cost structure and in their demand parameters, in a stationary infinite 
horizon setting. In addition, we allow demand to be dependent across retailers within any
particular period. Other recent work on transshipments includes Archibald, Sassen and Thomas
(1997), Tagaras (1999), Herer and Rashit (1999), Dong and Rudi (2000), and Rudi, Kapur and
Pyke (2001).

The paper most closely related to ours is that of Robinson (1990). For the model considered, 
it provides an analytical solution when there are two non-identical locations and when there are 
multiple identical (in cost) locations. Additionally, it contains a heuristic for the multi-location
non-identical case, which relies on stochastic integration based on Monte Carlo sampling. Robinson's
paper also contains a mathematical program that is close to the mathematical program presented here
in Section 3.2. Despite these similarities there are important differences both in our models and
approaches. The cost parameters that we allow are more general. Also, Robinson considers 
minimizing expected discounted cost and we consider expected long-run average cost, where 
cost includes holding, shortage, transshipment and replenishment costs. Most importantly, our 
approach is guaranteed to converge to the optimal values, whereas Robinson's heuristic, even 
though it performs very well, provides no such guarantees. 

Some of the recent papers have incorporated high levels of complexity into transshipment 
models. Unfortunately, such models become intractable rather quickly, leaving simulation as the 
only tool for investigating interesting policies. Crude simulation, however, can be very time 
consuming. We therefore propose to combine the modeling flexibility of simulation with 
stochastic optimization approaches. Simulation-based optimization techniques help the search 
for an improved policy while allowing for complex features that are typically outside of the 
scope of analytical models. 

In particular, we show that an optimal policy for the system we consider is for each retailer to 
follow an order-up-to policy. The optimality of the order-up-to policy takes into consideration 
the use of transshipments among retailers, to be performed once demand is observed. While we 
also show how to find optimal transshipment quantities, an order-up-to policy remains optimal 
under any stationary transshipment policy. This result is useful when considering what-if 
scenarios, for example, when transshipments are performed only within clusters of locations. 
We also demonstrate how the values of the order-up-to quantities can be calculated using a 
procedure that is based on Infinitesimal Perturbation Analysis, IPA (Ho et al. 1979). 

While the optimal order-up-to quantities have to be found once for the entire system, an 
optimal transshipment strategy has to be determined on a period-by-period basis, given the
period’s demand realization. We also show how these transshipment quantities can be found
using an LP/network flow framework. Moreover, we show that we can find an optimal
inventory replenishment policy for any stationary transshipment policy that may arise from 
practical considerations. This enables the comparison of several such alternatives as well as a 
comparison of each alternative with the optimal solution. 

The contribution of this paper is twofold. First is the development of an integrated IPA/LP 
algorithm for a system that allows transshipments. The system we consider differs from many 
previously studied systems with transshipments in that we consider multiple retailers, which 
differ both in their cost structure and in their demand parameters. Moreover, we show that we 
can find an optimal inventory replenishment policy for any stationary transshipment policy that 
may arise from practical considerations. This enables the comparison of several such 
alternatives, as well as a comparison of each alternative with the optimal solution. Second is a 
methodological contribution obtained by formulating and validating IPA derivative estimators 
for the transshipment problem. The estimators are based solely on data from the operation of a 
system at a single set of parameter values. Therefore, they are easily computed from the sample 
path generated by a simulation run. Formulating these estimates means introducing appropriate 
algorithms; validating them calls for showing that they converge to the correct values, where 
convergence is over the number of independent simulation replications (obtained, in our case, 
over regenerative cycles) used to estimate the derivative information. 

The paper is organized as follows: In Section 2 we describe the multi-location transshipment 
problem and introduce notation. In Section 3 we present the form of a combined optimal policy 
for the replenishment and transshipment strategies, together with our solution technique. In 
Section 4 we discuss the numerical study, which illustrates the solution technique. Section 5
concludes the paper.


2. Problem Description 


2.1 The Model 
In the system being investigated, there is one supplier and $N$ non-identical retailers, each
associated with a distinct stocking location and facing customer demand. The demand distribution
at each retailer in a period is assumed to be known and stationary over time. The system inventory
is reviewed periodically, and replenishment orders are placed with the supplier. In any period,
transshipments provide a means to reconcile demand-supply mismatches.

Within each period, events occur in the following order: first, replenishment orders placed 
with the supplier in the previous period arrive. These orders are used to satisfy any outstanding 
backlog and to increase the inventory level. Next, demand occurs.
Since demand realization is the only uncertain event of the period, once it is observed all
the decisions of the period, i.e., the transshipment and replenishment quantities, are made and paid
for. The transshipments are then made immediately, and subsequently the demand is
satisfied. Unsatisfied demand is backlogged. At this point, backlogs and inventories are 
observed, and penalty and holding costs, respectively, are incurred. The inventory is carried, as 
usual, to the next period. 

The goal is to find the transshipment and replenishment quantities that minimize the 
expected long-run average cost over an infinite horizon. The cost is the sum of the 
replenishment, transshipment, holding, and penalty costs. Note that items supplied
through transshipments satisfy demand immediately, while backlogged items have to wait until
the beginning of the next period. Thus, the advantage of using transshipments is in gaining a 
source of supply whose reaction time is shorter than that of the regular supply. 

To describe the operation of the system, we use the following notation. 

$N$ = number of retailers;

$D_i$ = random variable associated with the periodic demand at retailer $i$, with $E[D_i] < \infty$;

$f(D)$ = joint probability density function for the demand vector $D$;

$d_i$ = actual demand at retailer $i$ in an arbitrary period;

$h_i$ = holding cost incurred at retailer $i$ per unit held per period;

$p_i$ = penalty cost incurred at retailer $i$ per unit backlogged per period;

$c_i$ = replenishment cost per unit at retailer $i$;

$\hat{c}_{ij}$ = direct transshipment cost per unit transshipped from retailer $i$ to retailer $j$;

$c_{ij}$ = effective transshipment cost, or simply the transshipment cost, per unit transshipped
from retailer $i$ to retailer $j$, $c_{ij} = \hat{c}_{ij} + c_i - c_j$.


We will represent the vector of quantities described above, as well as the ones that we will
introduce later in the paper, by dropping the subscripts; thus, $d = (d_1, \ldots, d_N)$.

Note that $c_{ij}$ is considered the effective transshipment cost because when a unit is
transshipped from retailer $i$ to retailer $j$ we pay, in addition to the direct transshipment cost,
a cost of $c_i$ instead of $c_j$ to replenish the unit. Further note that $c_{ij}$ can be less than
zero. Even though we would expect such a situation to be rare, it can be handled by our analysis
without any modification. In fact, we would expect that in most situations $c_i = c_j$ is satisfied,
that is, $c_{ij} = \hat{c}_{ij}$. In this case, the differences, if any, between various $h_i$ values
result solely from retailers’ physical and geographical characteristics. For example, the size of
the warehouse and its material handling efficiency, or whether the retailer is in an expensive
business area or in a rural suburb may affect the cost structure. In any case, we observe that
$c_{ij} + c_{ji} = \hat{c}_{ij} + \hat{c}_{ji} > 0$.
This implies that it is not optimal to transship items back and forth.
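To make the definition of the effective transshipment cost concrete, the following minimal Python
sketch computes $c_{ij} = \hat{c}_{ij} + c_i - c_j$ for three retailers. The function name
`effective_costs` and all numbers are hypothetical and serve only as an illustration; they are not
taken from the numerical study.

```python
def effective_costs(direct, c):
    """Return c_ij = direct_ij + c_i - c_j for every ordered pair, with c_ii = 0."""
    n = len(c)
    return [[0.0 if i == j else direct[i][j] + c[i] - c[j] for j in range(n)]
            for i in range(n)]


# Hypothetical data for three retailers (not from the paper).
c = [10.0, 12.0, 11.0]            # unit replenishment costs c_i
direct = [[0.0, 1.5, 2.0],        # direct transshipment costs
          [1.5, 0.0, 1.0],
          [2.0, 1.0, 0.0]]
eff = effective_costs(direct, c)

# eff[0][1] is negative here: shipping from the cheaper-to-replenish retailer 0
# to retailer 1 saves more in replenishment cost than the shipment costs.
round_trip = eff[0][1] + eff[1][0]   # equals direct[0][1] + direct[1][0]
```

In this example the replenishment cost differentials cancel over a round trip, which is why
back-and-forth transshipment can never pay off when direct costs are positive.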

We consider base stock policies, where $S_i$ represents the order-up-to level at retailer $i$.
Given $d_i$, the actual demand at retailer $i$ in a given period, the dynamic behavior of the system
is captured through the following auxiliary variable:

$I_i$ = inventory level at retailer $i$ immediately after transshipments and demand satisfaction,

\[ I_i = S_i - d_i + \sum_{j \ne i} F_{B_j M_i} - \sum_{j \ne i} F_{B_i M_j}, \]

where $F_{B_i M_j}$ represents the transshipment quantity from retailer $i$ to retailer $j$ (the
motivation for this notation will become apparent below with a concise definition given in Table 1).
Note that $I_i$ may be either positive or negative, and we denote:
\[ I_i^+ = \max\{I_i, 0\}, \qquad I_i^- = \max\{-I_i, 0\}. \]
Thus, the realized cost of the system in a given period is equal to (with $c_{ii} = 0$):
\[ TC = \sum_{i=1}^{N} \sum_{j=1}^{N} c_{ij} F_{B_i M_j} + \sum_{i=1}^{N} h_i I_i^+ + \sum_{i=1}^{N} p_i I_i^- + \sum_{i=1}^{N} c_i d_i. \tag{1} \]
We show, in Section 3.1, that base stock policies minimize the expected long-run average
cost. Since the optimal policy is to order up to $S_i$ units at each retailer $i$, the beginning of
each period, after orders arrive and backorders are satisfied, is a regeneration point. That is, the
system returns to the same state ($S_i$ units at each retailer $i$). Thus, we can view the
multi-period problem as a series of single-period problems. In particular, minimizing the expected
cost in an arbitrary period will also minimize our objective function, the expected long-run average
cost. Furthermore, this regenerative structure enables the construction of an efficient algorithm to
compute the optimal order-up-to values. The algorithm is introduced in Section 3.3.


In Equation (1), the term $\sum_i c_i d_i$ is needed to fully account for the replenishment costs.
Since we are using an order-up-to $S_i$ replenishment policy at each retailer, the total amount
replenished system-wide will be exactly equal to the system-wide demand. Since this term is
independent of our decision variables, it is omitted below. Recall that the replenishment cost
differentials were included in the definition of $c_{ij}$.
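As an illustration of the single-period cost, the sketch below evaluates the holding, penalty, and
transshipment terms of the cost for given transshipment quantities, omitting the constant
replenishment term as in the text. It assumes that the post-transshipment inventory level at a
retailer is its starting stock minus its demand, plus inbound and minus outbound transshipments. The
function name `period_cost` and the data are hypothetical.

```python
def period_cost(S, d, F, c_eff, h, p):
    """Realized one-period cost, without the constant term sum_i c_i * d_i.

    F[i][j] is the quantity transshipped from retailer i to retailer j
    (diagonal entries are ignored).
    """
    n = len(S)
    cost = 0.0
    for i in range(n):
        inbound = sum(F[j][i] for j in range(n) if j != i)
        outbound = sum(F[i][j] for j in range(n) if j != i)
        level = S[i] - d[i] + inbound - outbound          # I_i, may be negative
        cost += h[i] * max(level, 0) + p[i] * max(-level, 0)
        cost += sum(c_eff[i][j] * F[i][j] for j in range(n) if j != i)
    return cost


# Hypothetical two-retailer instance: retailer 0 is short by 4 units, retailer 1 has 4 spare.
S, d = [10, 10], [14, 6]
h, p = [1, 1], [5, 5]
c_eff = [[0, 2], [2, 0]]
no_transshipment = period_cost(S, d, [[0, 0], [0, 0]], c_eff, h, p)
with_transshipment = period_cost(S, d, [[0, 0], [4, 0]], c_eff, h, p)
```

Comparing the two values shows the mechanism described above: moving the spare units replaces a
holding charge and a penalty charge with a single transshipment charge.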


2.2 Modeling Assumptions 


We make three mild assumptions, one regarding the replenishment policy and two regarding the
transshipment policy, but first we need three definitions.


Definition 1 A replenishment policy is shortage inducing if and only if the beginning inventory,
after orders arrive and backorders are satisfied, at some retailer can be strictly negative.
Moreover, a replenishment policy which is not shortage inducing is termed non-shortage
inducing.


Definition 2 A transshipment policy is stationary if and only if the transshipment quantities
decision is independent of the period in which it is made. That is, it depends only on the
pre-transshipment inventory and the observed demand. Similarly, a replenishment policy is
stationary if and only if the replenishment decision is independent of the period in which it is
made.


Definition 3 A transshipment policy is a no-buildup transshipment policy if and only if
transshipments are never made to build up inventory at the receiving location, that is,
transshipments are only made to satisfy actual current demand.


We consider only replenishment policies that are non-shortage inducing and
transshipment policies that are both stationary and no-buildup. The non-shortage inducing
assumption is needed to eliminate some pathological situations where the order-up-to quantity is
negative; moreover, this assumption is easily justified from a service level standpoint. A
customer may accept a shortage from time to time, but not ordering enough to satisfy an existing
shortage (as a shortage inducing policy may do) would not be a sustainable business decision.
The stationary assumption is made without loss of generality since our planning horizon is
infinite and both demand and the cost parameters are stationary, implying that we need only
consider replenishment and transshipment policies that are stationary. The no-buildup property
is guaranteed (see Corollary 1 to Theorem 1 below) if we assume (as was assumed in Tagaras
(1989), Robinson (1990) and Herer and Rashit (1999) as well as others) the following
relationship regarding the problem parameters:
\[ h_i \le c_{ij} + h_j \quad \text{for all } i \text{ and } j. \tag{2} \]


Intuitively, this inequality means that it is not economical to transfer a unit from retailer $i$
to retailer $j$ so that it would be held in inventory at retailer $j$ rather than at retailer $i$.
Several other assumptions that are often made in the literature on transshipments and/or appear to
be natural are not required here; see Section 3.4.
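Condition (2) is easy to verify for a given parameter set. The following sketch lists the ordered
pairs of retailers that violate $h_i \le c_{ij} + h_j$; the helper name
`violations_of_condition_2` and the data are hypothetical.

```python
def violations_of_condition_2(h, c_eff):
    """Return all ordered pairs (i, j), i != j, with h_i > c_ij + h_j."""
    n = len(h)
    return [(i, j) for i in range(n) for j in range(n)
            if i != j and h[i] > c_eff[i][j] + h[j]]


# Hypothetical data: holding is much cheaper at retailer 0 than at retailer 1.
h = [1.0, 3.0]
cheap_shipping = [[0.0, 1.0], [1.0, 0.0]]
costly_shipping = [[0.0, 2.5], [2.5, 0.0]]

bad_pairs = violations_of_condition_2(h, cheap_shipping)
ok_pairs = violations_of_condition_2(h, costly_shipping)
```

With cheap shipping it would pay to move stock from retailer 1 to retailer 0 purely for storage,
which is exactly the buildup that condition (2) rules out.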


3. Optimal Policies 


Two decisions need to be made each and every period: replenishment and transshipment
quantities. Those are discussed, respectively, in Section 3.1, where an order-up-to policy is
proven to be optimal for the replenishment decision, and in Section 3.2, where an LP/network
flow formulation is developed for the transshipment decision. In Section 3.3 we discuss how the
optimal values of the order-up-to policy may be found. Finally, in Section 3.4 we discuss some
relaxations of restrictions on the parameters.


3.1 Optimality of an Order-up-to Policy 

The optimal form of the replenishment policy is based on the following definition. 

Definition 4 A replenishment policy is an order-up-to $S = (S_1, S_2, \ldots, S_N)$ replenishment
policy if at retailer $i$ the beginning inventory, after orders arrive and backorders are satisfied,
is $S_i$ in every period.


Note that due to the no-buildup assumption of the transshipment policy, an order-up-to $S$
replenishment policy is regenerative whenever the replenishment policy is non-shortage
inducing. On the other hand, if for some $i$, $S_i < 0$, then, at the end of the period, another
retailer may make a transshipment to retailer $i$ causing the pre-replenishment inventory level at
retailer $i$ to be strictly greater than $S_i$. Since reducing inventory levels during the
replenishment stage in our model is not allowed¹, we cannot guarantee that a shortage inducing
order-up-to $S$ replenishment policy is regenerative.


Theorem 1 There exists an order-up-to $S = (S_1, S_2, \ldots, S_N)$ replenishment policy which is
optimal within the class of non-shortage inducing replenishment policies for any stationary
no-buildup transshipment policy.


Proof: We begin the proof by defining and then analyzing a system which is virtually identical
to the system described above. In fact, the new system differs in only two aspects:

1. At the end of the period, after holding and shortage costs are incurred, a retailer can either
purchase stock from or sell stock back to the supplier for the same price at which the stock can be
purchased at the beginning of the period.

2. The stock level at each retailer at the end of the period is constrained to be zero, i.e., no
inventory and no backorders are allowed.

In all other respects the two systems are identical.


Claim 1 Every replenishment policy in the original system has a corresponding replenishment
policy in the new system with identical cost.

If, in the original system, the end-of-period inventory level at retailer $i$ is $I_i$ and the
replenishment quantity is $r_i$ (thus incurring a replenishment cost of $c_i r_i$), then in the new
system retailer $i$ would, at the end of the previous period, sell back to the supplier $I_i$ units
(or, if $I_i < 0$, purchase $-I_i$ units) and order $r_i + I_i$ units during the replenishment stage
of the current period, thus incurring a cost of $c_i(-I_i + r_i + I_i) = c_i r_i$. The other aspects
of the two systems are identical.

¹ In fact, reducing inventory levels below zero has no obvious physical meaning.


Note that the converse of the claim is not true. In particular, using the supplier to reduce 
inventories in the new system is possible, whereas using the supplier to reduce inventories in the 
original system is impossible. 

Now let us examine the replenishment policy in the new system. In this newly defined 
system, since demands are stationary and independent across time periods and because the 
transshipment policy is stationary, the end of each period is a regeneration point. This means 
that, even though the planning horizon is infinite, the optimal replenishment decision in each and
every time period is the same. In particular, we let $S_i$ be the optimal order quantity at retailer
$i$ in the new system and we also note that this replenishment policy is an order-up-to
$S = (S_1, S_2, \ldots, S_N)$ replenishment policy.

Recall that any order-up-to S replenishment policy is also feasible in the original system. 
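Moreover, since every policy in the original system has an equal-cost counterpart in the new system
(Claim 1), the new system has at least as many feasible solutions, and its optimal cost is a lower
bound for the original system. Hence this replenishment policy is also optimal in the original
system, which completes the proof of the theorem. □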


Corollary 1: If Equation (2) holds for all retailers $i$ and $j$, then the optimal transshipment
policy has the no-buildup property.


Building up inventory in the new system when Equation (2) holds is clearly sub-optimal. Since 
every no-buildup transshipment policy is feasible in the original system, we know that the 
optimal transshipment policy has the no-buildup property. 

Note that the transshipment policy need not be optimal (or even reasonable) for Theorem 1 to
hold. In the next section, we show how to compute the optimal transshipment policy. However,
if for some reason another transshipment policy is desired, e.g., grouping retailers into (possibly
overlapping) pooling groups such that retailers only transship to other retailers in the same
group, then Theorem 1 still holds.


3.2 Determining the Optimal Transshipment Quantities 


Given an order-up-to S policy for the replenishment quantities, the optimal transshipment 
quantities need to be determined each period between every two retailers. To this end, we 
develop a linear cost network flow model of an arbitrary single period. The network flow model
we develop is not the only one possible; indeed there exists a network flow representation with $N$
fewer nodes and $N$ fewer arcs than the one we present². We choose to present this particular
representation because it clearly reflects the events and actions in a period, implicitly showing
the flow of time.

Let us recall the events in this arbitrary period; in particular, let us examine the movement of
material. At the beginning of the period, after orders arrive and backorders are satisfied, there
are $S_i$ units in stock at each retailer $i$. This stock can be used in one of three different ways:
to satisfy demand at retailer $i$, to satisfy demand at retailer $j$ (i.e., a transshipment from
retailer $i$ to retailer $j$), or to be held in inventory at retailer $i$. While it is physically
possible to move stock from retailer $i$ to another retailer, e.g., $j$, for storage, this is
precluded by the no-buildup assumption.

At the end of the period units are on-order from the supplier. This material will be used in 
two different ways: to satisfy a backorder at a retailer or to build up inventory at a retailer so that 
the retailer will start the next period, after the order arrives and backorders are satisfied, 
with $S_i$ units in stock. The stock at the beginning of the period, after the order from the previous
period arrives and backorders are satisfied, and the replenishment made during the current period 
are the only two sources of material. 

Let us now examine the material flow from the demand side (i.e., the sinks). The demand at
retailer $i$, $d_i$, can be satisfied in one of three different ways: from the inventory at retailer
$i$, from the inventory at another retailer $j$ (i.e., through a transshipment from retailer $j$ to
retailer $i$), or from replenishment during the current period (that arrives at the start of the
next period). Another sink for material is the requirement that each retailer $i$ begins the next
period, after orders arrive and backorders are satisfied, with $S_i$ units. These units can come from
one of two sources: the inventory at retailer $i$ or replenishment during the period. As discussed
above, inventory from another retailer will not be used to build up inventory levels at retailer $i$.

Using the observations above, we model the movement of stock during a period as a network
flow problem. In particular, we have a source node, $B_i$, to represent the beginning, i.e., initial
inventory at retailer $i$, after orders arrive and backorders are satisfied, and a source node, $R$,
to represent the replenishment that occurs in the period and arrives at the start of the next
period. The sink node associated with the demand at retailer $i$ will be denoted $M_i$. Similarly, we
will denote by $E_i$ the ending inventory at retailer $i$, including units on order from the
supplier. Note that this is equal to the inventory at the beginning of the next period, after orders
arrive and backorders are satisfied. The arcs in the network flow problem are exactly those
activities described above and are summarized (with their associated cost per unit flow) in Table 1.
We use the letter ‘F’ to denote the flow in the network and subscripts to indicate the starting and
ending nodes of the flow; thus $F_{B_i M_j}$ is the flow in the network from node $B_i$ to node $M_j$.

² We would like to thank the anonymous referee for pointing out the existence of the alternative
network flow representation.

The complete network flow representation of the problem can be found in Figure 1 for four
retailers. Note that the graph is bipartite, though our representation of the graph, which was
chosen to show the connection to the underlying inventory problem, does not emphasize this
characteristic. The LP formulation associated with this network flow problem is as follows:
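Problem (P)

\[ Z(S,d) = \min \sum_{i=1}^{N} h_i F_{B_i E_i} + \sum_{i=1}^{N} \sum_{j=1}^{N} c_{ij} F_{B_i M_j} + \sum_{i=1}^{N} p_i F_{R M_i} \]

s.t.

\[ F_{B_i E_i} + \sum_{j=1}^{N} F_{B_i M_j} = S_i, \quad i = 1, \ldots, N \tag{3} \]
\[ \sum_{j=1}^{N} F_{B_j M_i} + F_{R M_i} = d_i, \quad i = 1, \ldots, N \tag{4} \]
\[ \sum_{i=1}^{N} F_{R M_i} + \sum_{i=1}^{N} F_{R E_i} = \sum_{i=1}^{N} d_i \tag{5} \]
\[ F_{B_i E_i} + F_{R E_i} = S_i, \quad i = 1, \ldots, N \tag{6} \]
\[ F_{B_i E_i},\; F_{B_i M_i},\; F_{B_i M_j},\; F_{R M_i},\; F_{R E_i} \ge 0, \quad i = 1, \ldots, N;\; j = 1, \ldots, N \]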


Equations (3), (4), (5) and (6), respectively, represent the inventory balance constraints at the
$B_i$, $M_i$, $R$, and $E_i$ nodes.

Table 1: The definition of the arcs in the network flow problem

Arc            Variable         Cost per unit flow          Meaning
$(B_i, E_i)$   $F_{B_i E_i}$    $h_i$                       inventory is held at retailer $i$
$(B_i, M_i)$   $F_{B_i M_i}$    0                           stock at retailer $i$ is used to satisfy demand at retailer $i$
$(B_i, M_j)$   $F_{B_i M_j}$    $c_{ij}$ ($c_{ii} = 0$)     stock at retailer $i$ is used to satisfy demand at retailer $j$, i.e., transshipment from retailer $i$ to retailer $j$
$(R, M_i)$     $F_{R M_i}$      $p_i$                       shortage at retailer $i$ is satisfied through replenishment
$(R, E_i)$     $F_{R E_i}$      0                           inventory at retailer $i$ is increased through replenishment
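Because Problem (P) is a transportation-type network flow problem, small instances can be solved
with very little code. The sketch below is one possible way to do so, using a textbook successive
shortest path method with Bellman-Ford in pure Python; the choice of method is ours and is not
prescribed by the formulation, and any LP or network flow solver could be substituted. The function
name `solve_period`, the integer-valued data, and the node numbering are assumptions made for the
illustration.

```python
def solve_period(S, d, h, p, c_eff):
    """Solve Problem (P) for one period by successive shortest paths.

    Inputs are assumed to be integer-valued lists; c_eff[i][j] is the effective
    transshipment cost from retailer i to retailer j.
    Returns (minimum cost, F) with F[i][j] = units transshipped from i to j.
    """
    n = len(S)
    R = n                                   # replenishment source; nodes 0..n-1 are B_i
    M = [n + 1 + i for i in range(n)]       # demand sinks M_i
    E = [2 * n + 1 + i for i in range(n)]   # ending-inventory sinks E_i
    src, snk = 3 * n + 1, 3 * n + 2         # artificial super source and super sink
    edges, adj = [], [[] for _ in range(3 * n + 3)]

    def add(u, v, cap, cost):
        # Forward edge at an even index, its residual twin at the next odd index.
        adj[u].append(len(edges)); edges.append([v, cap, cost])
        adj[v].append(len(edges)); edges.append([u, 0, -cost])

    big = sum(S) + sum(d)                   # no arc ever needs more capacity than this
    ship = {}
    add(src, R, sum(d), 0)                  # constraint (5): R supplies the total demand
    for i in range(n):
        add(src, i, S[i], 0)                # constraint (3): B_i supplies S_i
        add(M[i], snk, d[i], 0)             # constraint (4): M_i absorbs d_i
        add(E[i], snk, S[i], 0)             # constraint (6): E_i absorbs S_i
        add(i, E[i], big, h[i])             # hold inventory at retailer i
        add(R, M[i], big, p[i])             # shortage filled by replenishment
        add(R, E[i], big, 0)                # replenish back up to S_i
        for j in range(n):
            ship[i, j] = len(edges)
            add(i, M[j], big, 0 if i == j else c_eff[i][j])

    total = 0
    while True:
        # Bellman-Ford on the residual graph (costs may be negative).
        dist = [float("inf")] * len(adj)
        prev = [None] * len(adj)
        dist[src] = 0
        for _ in range(len(adj)):
            changed = False
            for u in range(len(adj)):
                if dist[u] == float("inf"):
                    continue
                for k in adj[u]:
                    v, cap, cost = edges[k]
                    if cap > 0 and dist[u] + cost < dist[v]:
                        dist[v], prev[v], changed = dist[u] + cost, k, True
            if not changed:
                break
        if prev[snk] is None:               # sink unreachable: all flow has been routed
            break
        push, v = big, snk                  # bottleneck capacity along the path
        while v != src:
            push = min(push, edges[prev[v]][1])
            v = edges[prev[v] ^ 1][0]
        v = snk
        while v != src:                     # augment along the path
            k = prev[v]
            edges[k][1] -= push
            edges[k ^ 1][1] += push
            v = edges[k ^ 1][0]
        total += push * dist[snk]
    F = [[0 if i == j else edges[ship[i, j] ^ 1][1] for j in range(n)] for i in range(n)]
    return total, F


# Hypothetical two-retailer instance: retailer 0 is short by 4, retailer 1 has 4 spare.
cost, F = solve_period(S=[10, 10], d=[14, 6], h=[1, 1], p=[5, 5], c_eff=[[0, 2], [2, 0]])
```

The super source and super sink are only a device for turning the four balance constraints into
arc capacities; they carry no cost and do not correspond to anything in the inventory system.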


Figure 1: Network flow representation of a single period


3.3 Finding the Optimal Order-up-to Values 


In the most general setting, exact computation of optimal order-up-to levels by analytical
methods is difficult. We therefore use an approach based on Monte Carlo simulation: a random
sample of demands is generated and the sample average is used in the optimization. In
particular, we deploy a sample-path optimization technique, IPA, to compute the optimal
order-up-to levels.

Glasserman (1991) established the general conditions for the unbiasedness of the IPA 
estimator. Applications of perturbation analysis have been reported in simulations of Markov 
chains (Glasserman 1992), inventory models (Fu 1994), manufacturing systems (Glasserman 
1994), finance (Fu and Hu 1997), and control charts for statistical process control (Fu and Hu 
1999). IPA-based methods have also been introduced to analyze supply chain problems 
(Glasserman and Tayur 1995). 

The idea is to use the expected value of the sample path derivative obtained via simulation 
instead of using the derivative of the expected cost in a gradient search method. In other words, 
the gradient of interest is $dE[TC]/dS$ whereas our numerical procedure computes $E[dTC/dS]$. To
validate this approach, that is, to justify the interchange of the derivative and the integral, we
need to show that the objective function is jointly convex and “smooth” in the $S_i$ variables.
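The interchange of derivative and expectation can be illustrated on the simplest special case, a
single retailer, where the period cost reduces to $h(S-d)^+ + p(d-S)^+$ and its sample-path
derivative is $h$ when $d < S$ and $-p$ when $d > S$. The sketch below, with hypothetical parameters
and a uniform demand chosen only for illustration, compares the averaged sample-path derivative with
a finite-difference estimate of the derivative of the average cost.

```python
import random


def ipa_derivative(S, demands, h, p):
    """Average sample-path derivative of h*(S-d)^+ + p*(d-S)^+ with respect to S."""
    return sum(h if d < S else -p for d in demands) / len(demands)


def average_cost(S, demands, h, p):
    """Sample average of the single-retailer period cost."""
    return sum(h * max(S - d, 0) + p * max(d - S, 0) for d in demands) / len(demands)


random.seed(0)
demands = [random.uniform(0, 100) for _ in range(20000)]   # assumed demand model
h, p, S = 1.0, 4.0, 60.0                                   # hypothetical parameters

ipa = ipa_derivative(S, demands, h, p)
# Central finite difference of the sample-average cost over the same demands.
fd = average_cost(S + 0.5, demands, h, p) - average_cost(S - 0.5, demands, h, p)
```

For this demand model the exact derivative of the expected cost at $S = 60$ is
$h \cdot 0.6 - p \cdot 0.4 = -1$, and both estimates should be close to it; the IPA estimate needs
only one pass over the sample path at a single value of $S$.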


To show that the expected cost is jointly convex in the decision variables, we first show that
for a given demand, $d$, $Z(S,d)$ is jointly convex in $S$. This is done by rewriting problem (P)
such that all the $S_i$ variables appear on the ‘right hand side’. We then apply the result that the
optimal objective values of linear programs are convex piecewise linear functions of their right
hand sides (see, e.g., Bradley, Hax and Magnanti, 1977, page 697). Since the expectation of a convex
function over the demand distribution is itself convex, we know that the expected cost in a single
period is itself jointly convex in $S$.

It remains to show that the objective function is “smooth,” i.e., the derivatives are both
continuous and bounded, to validate our IPA estimators (which we formulate below). As
illustrated in Lemma 3.2 of Glasserman and Tayur (1995), continuity and boundedness can be
verified by establishing that inventories are, with probability 1, Lipschitz functions of the
order-up-to levels, which is clearly the case here. Since the Lipschitz property is preserved by
min/max operators and addition, the derivatives of the total cost are also both bounded and
continuous functions of the order-up-to levels. To summarize, since we established the
smoothness of the objective function, our IPA estimators are guaranteed to be unbiased.
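
As a hedged illustration of what unbiasedness means here, consider the single-retailer special case without transshipments, where the one-period cost reduces to the newsvendor cost h*max(S - d, 0) + p*max(d - S, 0). The sketch below compares the simulated average of the sample-path derivative, E[dTC/dS], with the closed-form derivative of the expected cost, dE[TC]/dS = (h + p)F(S) - p. The function names, the Uniform(0, 200) demand, and the parameter values are assumptions chosen for illustration only.

```python
import random


def sample_path_derivative(S, d, h, p):
    """Derivative of h*(S-d)^+ + p*(d-S)^+ with respect to S on one sample path."""
    return h if S > d else -p


def ipa_gradient(S, h, p, n_samples, rng):
    """Average the sample-path derivative over simulated demands."""
    total = 0.0
    for _ in range(n_samples):
        d = rng.uniform(0.0, 200.0)  # assumed demand distribution (illustrative)
        total += sample_path_derivative(S, d, h, p)
    return total / n_samples


def exact_gradient(S, h, p):
    """dE[TC]/dS = (h + p) * F(S) - p, with F the Uniform(0, 200) cdf."""
    return (h + p) * (S / 200.0) - p


rng = random.Random(12345)
estimate = ipa_gradient(S=120.0, h=1.0, p=4.0, n_samples=20000, rng=rng)
exact = exact_gradient(S=120.0, h=1.0, p=4.0)
```

With enough samples the two quantities should agree up to simulation noise, which is the property the smoothness argument above is meant to guarantee.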
Description of the IPA Procedure
The procedure starts with an arbitrary value for the order-up-to levels, S^0. An instance of the
demand is generated at each retailer. Note that any covariance structure is allowed in f(D).
Once the demand is observed, problem (P) is solved in a deterministic fashion to compute the
minimum-cost solution. The gradient of the total cost (derivatives with respect to the order-up-to
levels) is estimated and accumulated over regenerative cycles; the average gradient value is then
used to update the values of S. A thorough review of simulation-based stochastic optimization
techniques can be found in Shapiro (2001).

The procedure is summarized in a pseudo-code format, where K denotes the number of steps
taken in a path search, U represents the number of regenerative cycles, a_k represents the step size
at iteration k, and S_i^k represents the order-up-to level for retailer i at the kth iteration:


Algorithm 1 


Initialize K
Initialize U
Set k ← 1
For each retailer, set initial order-up-to levels, S_i^0, possibly based on demand distribution
Repeat
    Set dTC_i ← 0
    Set u ← 0
    Repeat
        i.   Generate an instance of the demand at each retailer, d, from f(D)
        ii.  Solve problem (P) to determine optimal transshipment quantities
        iii. Accumulate the desired gradients (derivatives) of the total cost, dTC_i
        iv.  u ← u + 1
    Until u = U
    v.   Calculate the desired gradient(s), dTC_i / U
    vi.  Update the order-up-to levels, S_i: S_i^k ← S_i^{k-1} - a_k (dTC_i / U)
    vii. k ← k + 1
Until k = K
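
The loop structure of Algorithm 1 can be sketched in Python for the simplest special case, in which no transshipments are allowed, so that problem (P) decouples into one newsvendor problem per retailer and Step (ii) becomes trivial. This is a minimal sketch under stated assumptions, not the implementation used in the paper: the function name `run_algorithm_1`, the Uniform(0, 200) demand, the step size a_k = step_scale / k, and the reset of negative levels to zero are illustrative choices. A full version would replace the body of Steps (ii) and (iii) with a network flow solve.

```python
import random


def run_algorithm_1(h, p, S0, K, U, step_scale, seed=0):
    """Stochastic gradient search of Algorithm 1 without transshipments.

    h, p, S0 are per-retailer lists. Demand is drawn independently from
    Uniform(0, 200) at every retailer (an illustrative assumption).
    """
    rng = random.Random(seed)
    S = list(S0)
    n = len(S)
    for k in range(1, K + 1):
        dTC = [0.0] * n
        for _ in range(U):
            # Step (i): generate one demand instance per retailer.
            d = [rng.uniform(0.0, 200.0) for _ in range(n)]
            # Step (ii): with no transshipments, problem (P) is trivial;
            # retailer i ends the period with S_i - d_i units or a backlog.
            # Step (iii): accumulate the sample-path derivatives.
            for i in range(n):
                dTC[i] += h[i] if S[i] > d[i] else -p[i]
        a_k = step_scale / k  # step size at iteration k
        for i in range(n):
            # Steps (v)-(vi): average the gradient, update S_i,
            # and reset negative levels to zero.
            S[i] = max(0.0, S[i] - a_k * dTC[i] / U)
    return S


levels = run_algorithm_1(h=[1.0, 1.0], p=[4.0, 4.0], S0=[100.0, 100.0],
                         K=300, U=200, step_scale=50.0, seed=1)
```

For h = 1 and p = 4 the critical ratio is 0.8, so under the assumed Uniform(0, 200) demand both levels should settle near 160.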


In Step (iii) of the algorithm, we use IPA to compute the gradient. To illustrate the sample-
path derivative idea, suppose that we end a period with inventory at retailer i. In this case,
raising S_i by 1 unit would result in increasing the total cost by h_i. In the computer
implementation, for each retailer i, we could partially code Step (iii) as:


dTC_i = dTC_i + h_i, if inventory at retailer i is positive, at the end of Step (ii).

Starting with dTC_i = 0 for all i at the beginning of the simulation and dividing dTC_i by U in Step
(v) yield the derivative estimates.

Our network flow formulation greatly simplifies computations. Increasing S_i corresponds to
increasing the supply at source node B_i and the demand at sink node E_i. From a network flow
perspective, dTC/dS_i = h_i if the arc (B_i, E_i) is basic or, equivalently, the flow on this arc
is positive. If the arc is non-basic, then since any basic solution corresponds to a tree in the
network, there exists a unique augmenting path from B_i to E_i whose total cost yields the
gradient value. For example, the augmenting path may go from B_i to M_j to R to E_i, with an
associated cost of c_ij - p_j. Such a path represents a transshipment from retailer i to retailer j
(with a cost of c_ij), a reduction in backorders at retailer j (with a savings of p_j) and a
purchase of another unit at retailer i (cost of zero).

Furthermore, our implementation of the derivative computation in Step (iii) is very efficient.
Since the value of the gradient is equal to the total cost along the unique path from B_i to E_i for
each retailer i, this quantity can be calculated directly as the difference between the holding cost
at retailer i and the reduced cost of the arc (B_i, E_i), which is readily available from the linear
programming solution in Step (ii).
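
The two gradient values discussed above (the holding cost when stock is left over, and the augmenting-path cost c_ij - p_j when the extra unit is transshipped) can be checked numerically on a two-retailer example. The sketch below is hedged: it replaces the network flow solve with a closed-form rule that is only valid under the assumption that c_ij < h_i + p_j, so that every surplus unit is moved to cover a shortage, and it ignores replenishment costs. The function names and parameter values are hypothetical.

```python
def period_cost(S, d, h, p, c):
    """One-period cost for two retailers under a simple transshipment rule.

    Assumption (illustrative): c[i][j] < h[i] + p[j], so each unit of surplus
    at one retailer is transshipped to cover a shortage at the other.
    """
    surplus = [max(S[i] - d[i], 0.0) for i in range(2)]
    shortage = [max(d[i] - S[i], 0.0) for i in range(2)]
    x01 = min(surplus[0], shortage[1])  # units moved from retailer 0 to 1
    x10 = min(surplus[1], shortage[0])  # units moved from retailer 1 to 0
    net = [S[0] - d[0] - x01 + x10, S[1] - d[1] - x10 + x01]
    cost = c[0][1] * x01 + c[1][0] * x10
    for i in range(2):
        cost += h[i] * max(net[i], 0.0) + p[i] * max(-net[i], 0.0)
    return cost


def unit_bump_derivative(i, S, d, h, p, c):
    """Cost change from raising S_i by one unit on a fixed demand sample path."""
    bumped = list(S)
    bumped[i] += 1.0
    return period_cost(bumped, d, h, p, c) - period_cost(S, d, h, p, c)


h, p = [1.0, 1.0], [4.0, 4.0]
c = [[0.0, 0.5], [0.5, 0.0]]
# Retailer 0 ends with stock: the extra unit is simply held, costing h_0.
held = unit_bump_derivative(0, [100.0, 100.0], [80.0, 90.0], h, p, c)
# Retailer 0 ends empty while retailer 1 is backordered: the extra unit is
# transshipped, costing c_01 - p_1.
shipped = unit_bump_derivative(0, [100.0, 100.0], [100.0, 130.0], h, p, c)
```

In the first scenario the cost change equals h_0 = 1, and in the second it equals c_01 - p_1 = 0.5 - 4 = -3.5, matching the path-cost interpretation.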


In Step (vi) of the algorithm, one typically imposes conditions on the step size a_k such that

\sum_{k=1}^{\infty} a_k = \infty   and   \sum_{k=1}^{\infty} a_k^2 < \infty.

For instance, a_k = 1/k satisfies these requirements. The first condition facilitates convergence by
ensuring that the steps do not become too small too fast. However, if the algorithm is to
converge, the step sizes must eventually become small, as ensured by the second condition. Note
that when the gradient estimator is unbiased (as is the case here), Step (vi) represents a
Robbins-Monro algorithm (1951) for stochastic search.
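
As a short worked check of the two step-size conditions, take the harmonic choice a_k = a/k with any constant a > 0 (the constant a is notation introduced here for illustration). The first sum is a multiple of the divergent harmonic series, while the second is a multiple of a convergent series:

```latex
\sum_{k=1}^{\infty} \frac{a}{k} = a \sum_{k=1}^{\infty} \frac{1}{k} = \infty,
\qquad
\sum_{k=1}^{\infty} \left(\frac{a}{k}\right)^{2}
  = a^{2} \sum_{k=1}^{\infty} \frac{1}{k^{2}}
  = \frac{a^{2}\pi^{2}}{6} < \infty .
```

Scaling the step size by a constant therefore does not affect whether the conditions hold; it only changes how aggressively the early iterations move.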


Sensitivity Analysis 
In a similar fashion, we can compute the derivative of the total cost with respect to other model
parameters such as holding cost, penalty cost, transshipment cost, and replenishment cost.
Furthermore, we can conduct performance analysis for service levels, expressed in terms of fill
rate both at a single retailer and system-wide. Some of these gradient estimators are illustrated in
Table 2.


Table 2: Other gradient estimators 


Derivative Estimator 
dTC/dh; a 
dTC/dp; Fou. 
dTC/d e; F, M, 
dTC/dc; Frag +>) Fp, -> ie 
j#i fei 


The derivative estimators are quite intuitive. For example, suppose that in the optimal
solution to the network flow problem retailer i holds inventory at the end of a period (the flow
on arc (B_i, E_i) is positive). An increase in the holding cost would therefore increase the total
cost by the amount of excess stock being held. Similarly, if no excess inventory is held at
retailer i at the end of a period, an increase in holding cost would have no impact on the total
cost.

Finally, we should point out that as long as the transshipment policy preserves the
smoothness of the cost function with respect to the order-up-to levels, Algorithm 1 (with an
appropriately defined method of obtaining the per period gradient information) can be used
without modification. That is, the transshipment policy need not be optimal (as was also the case
with the correctness of Theorem 1) if for some reason another transshipment policy is desired.


3.4 Relaxing the Restrictions on the Parameters 

Several assumptions that are often made in the literature on transshipments and/or appear to be 
natural are not required for our model and analysis. These assumptions, some of which are 
typically referred to as triangle inequalities, are: 


i) c_ij < h_i + p_j: Not requiring this inequality, i.e., allowing c_ij > h_i + p_j, means that when
one retailer has an inventory surplus and the other has a backlog before transshipments, it is not
necessarily economical to transfer a unit from the former to the latter. With two-location
models, as well as with identical-location models, this inequality is needed to ensure that
transshipments are economical (otherwise, no transshipments will ever occur). However,
since we have a multi-location model with possibly non-identical costs, this restriction is no
longer natural. (Clearly, if this inequality is violated for every pair i and j, no
transshipment will occur.)

ii) p_j < c_ij + p_i: Not requiring this inequality, i.e., allowing p_j > c_ij + p_i, means that it
may be economical to transship a unit from retailer i to retailer j even when retailer i herself has
a shortage. Such a cost structure may occur when different retailers have different priorities,
and therefore a retailer with a higher priority might have a (possibly significantly) higher unit
shortage cost. We would expect this inequality to hold in most practical situations.

iii) c_ik ≤ c_ij + c_jk: Not requiring this inequality, i.e., allowing c_ik > c_ij + c_jk, means that
it may be economical to use retailer j as an intermediary point between retailer i and retailer k,
rather than to transship directly from retailer i to retailer k. We envision such a situation when
transshipments have to be accomplished within a limited time. Then, retailers i and j may be
close enough to allow transshipments, and similarly retailers j and k. However, the time to
transfer goods between retailers i and k may be so large that c_ik is in essence infinite.
When retailer j is used as an intermediary point, the amount transshipped through it is
limited to S_j. Thus, it is incorrect to set c_ik = c_ij + c_jk. This point is illustrated in our
computational study, where this is the only difference between systems 3 and 4.

iv) Demands at different retailers in the same period are independent of one another. Not
requiring this assumption means that in our model the demands among retailers in a given
period may be correlated. Some of the existing transshipment literature could easily be
extended to incorporate correlated demand, but the subject, in general, is not considered.


As mentioned, our model and analysis can handle all the above relaxations and generalizations
without any modification.
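
Although none of these inequalities is required, it can be useful to know which of them a given cost structure violates. The following sketch reports violations of the three cost inequalities, read here as (i) c_ij < h_i + p_j, (ii) p_j < c_ij + p_i, and (iii) c_ik ≤ c_ij + c_jk. The function name `violated_inequalities` and the three-retailer example are hypothetical and only serve as an illustration.

```python
def violated_inequalities(h, p, c):
    """Return (label, indices) pairs at which inequalities (i)-(iii) fail.

    h, p are lists of holding and shortage costs; c[i][j] is the unit
    transshipment cost from retailer i to retailer j (float('inf') if the
    link is not allowed).
    """
    n = len(h)
    found = []
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            if not c[i][j] < h[i] + p[j]:          # (i) fails
                found.append(("i", (i, j)))
            if not p[j] < c[i][j] + p[i]:          # (ii) fails
                found.append(("ii", (i, j)))
            for k in range(n):
                if k in (i, j):
                    continue
                if not c[i][k] <= c[i][j] + c[j][k]:  # (iii) fails
                    found.append(("iii", (i, j, k)))
    return found


INF = float("inf")
# Three retailers on a line: 0 and 2 can only reach each other through 1.
line_costs = [[0.0, 0.5, INF],
              [0.5, 0.0, 0.5],
              [INF, 0.5, 0.0]]
report = violated_inequalities([1.0] * 3, [4.0] * 3, line_costs)
```

In this example the direct link between retailers 0 and 2 is effectively infinite, so inequalities (i) and (iii) are flagged for that pair while (ii) holds everywhere.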


4. Computational Study 


In this section, we report on our numerical study. We first report in Section 4.1 on a study
conducted to validate our results and to fine-tune our algorithm. In Section 4.2, we describe the
experimental design, which serves as the base case for all our experiments. In Section 4.3, we
describe and analyze the results obtained for this basic experiment. In Sections 4.4 and 4.5, we
describe two other experiments, for correlated demand and non-identical costs, respectively, and
describe and analyze their results.


4.1 Validation and Fine-Tuning 
Recall that, in Step (vi) of the algorithm, we incorporate our derivative estimates in a stochastic
version of a gradient search technique. More specifically, for each retailer i we compute

S_i^k ← S_i^{k-1} - a_k (dTC_i / U),

where S_i^k is the order-up-to level for retailer i at the kth iteration, a_k is the step size, and
(dTC_i / U) is the estimate of the gradient of the average cost when S_i^{k-1} is the
order-up-to level at retailer i.
Finding effective values for the algorithm parameters, that is, starting values for the
order-up-to levels, S_i^0, the step sizes, a_k, and the termination criteria, is generally a
difficult problem. We conducted a thorough search, experimenting with different strategies using
the illustrative examples from Krishnan and Rao (1965) and Tagaras (1989), where optimal
solutions are available.

Based on this experimentation, we set the total number of steps for the path search K = 3000,
the number of independent replications at each step U = 1000, and the step size a_k = 1000/k for
the validation examples. To estimate the gradient values, the experiments in Section 4.2 are
conducted with U = 20,000 iterations. As a stopping criterion, we compared the order-up-to
levels over 500 iterations and required that these values do not differ by more than 1.
Ultimately, the maximum number of steps in the path search algorithm, K, was set at 10,000. In
all of our experiments, the convergence criterion was satisfied long before 10,000 iterations.
Each experiment has also been replicated. The reported results reflect the averages across 20
independent replications.
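
One possible reading of the stopping criterion is sketched below: the search stops once, for every retailer, the largest and smallest order-up-to levels observed over the last 500 iterations differ by at most 1. This interpretation, the function name `has_converged`, and the layout of `history` are assumptions made for illustration; the text does not specify the exact comparison that was used.

```python
def has_converged(history, window=500, tolerance=1.0):
    """Check whether every order-up-to level stayed within `tolerance`
    over the last `window` iterations.

    history is a list of per-iteration lists [S_0^k, S_1^k, ...].
    """
    if len(history) < window:
        return False
    recent = history[-window:]
    n = len(recent[0])
    for i in range(n):
        values = [levels[i] for levels in recent]
        if max(values) - min(values) > tolerance:
            return False
    return True


# Example: a two-retailer trajectory that is flat for 500 iterations.
flat = [[160.0, 117.0] for _ in range(500)]
stop_now = has_converged(flat)
```

A check like this would typically be evaluated once per outer iteration, after the update in Step (vi).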


During the execution of the algorithm, the path search may push the order-up-to levels, S_i^k,
below zero. This is due to the step size, a_k, being too large. Since a negative order-up-to level is
not allowed by our assumption that the replenishment policy is non-shortage inducing, our
algorithm simply resets their value to zero. We now illustrate our algorithm through Example 1
in Krishnan and Rao (1965) with seven retailers. The characteristics of the retailers are
summarized in Table 3 along with the optimal order-up-to levels calculated by Krishnan and
Rao. Recall that all retailers have identical cost structures with a holding cost of $1 per unit,
shortage cost of $4 per unit, and transshipment cost of $0.10 per unit.

The last two rows of Table 3 depict the order-up-to levels computed by our algorithm. The
half-width of a 95% confidence interval based on 20 independent replications is also reported to
show the low variability of the IPA estimators. The initial values for the order-up-to levels were
S_i^0 = 100 for all retailers. The experiments were conducted on a personal computer with a 3-GHz
Pentium IV microprocessor. Figure 2a shows the convergence of the algorithm for the seven-
retailer network. Figure 2b illustrates the convergence of the order-up-to level for retailer 7, to
depict the convergence rate more clearly. Figure 3 shows the run times (expressed in terms of
the wall clock time) for networks ranging from two to seven retailers.


Table 3: Example 1 from Krishnan and Rao (1965)


Retailer                    1        2        3        4        5        6        7
Normal Demand (μ, σ)        100,20   200,50   150,30   170,50   180,40   170,30   170,50
S_i (Krishnan and Rao)      106.7    216.7    160.0    186.7    193.4    180.0    186.7
Computed Avg S_i            106.9    215.3    160.6    187.0    193.6    180.4    187.0
HW of a 95% CI              0.0231   0.0202   0.0257   0.0224   0.0231   0.0224   0.0161
Figure 2a: Convergence of the algorithm for a 7-retailer configuration
Figure 3: Run time of the algorithm


Note, from Figure 2, that convergence to the correct order-up-to levels is very rapid. Quick
convergence was also observed in all network configurations with two to seven retailers. Also
note that the results computed by the algorithm never deviate by more than 0.5% from the values
reported in Krishnan and Rao. Similar convergence behavior was observed with the test problem
taken from Tagaras (1989). We should point out that the computational time, between two and
seven minutes for different numbers of retailers, is quite reasonable for a planning problem.
Moreover, to obtain a rough estimate of the results even faster, e.g., for the purpose of a
“what-if” type of analysis, a limited number of iterations may be conducted (see Figure 2b).
4.2 Experimental Design


To show the flexibility afforded by our modeling and analysis framework, we have experimented
with large networks, with retailers whose demand is correlated, and with an arbitrary cost
structure. We consider systems with N+1 retailers, where N ∈ {7,9,...,21}. An illustrative
example of the system with four retailers is shown in Figure 4. Let us call retailer 0 the central
retailer and all other N retailers the remote retailers. We begin by considering the case of
identical retailers; the cost parameters are as follows: h_i = h = $1 per unit, p_i = p = $4 per unit,
and the basic direct transshipment cost, c_t = $0.5 per unit, when transshipments are allowed.
Each retailer faces an independent demand stream distributed uniformly over (0, 200).

Note that c_0i, i=1,2,...,N, represents the transshipment cost from the central retailer to remote
retailers, c_i0, i=1,2,...,N, represents the transshipment cost from the remote retailers to the central
retailer, and c_ij, i,j=1,2,...,N, denotes the transshipment cost from remote retailer i to remote
retailer j. As summarized in Table 4, we consider five alternative system configurations and we
denote by S_i^s the order-up-to level for retailer i under system s, s=1,...,5. Note that c_ij = ∞
implies that transshipments are not allowed between retailers i and j.

System 1, where no material movement is allowed among retailers, represents N+1
independent newsvendor problems. It thus serves as a benchmark. In system 2, transshipments
are allowed only from the central retailer to the remote retailers. System 3 extends the scenario
in system 2 by allowing transshipments from the remote retailers to the central retailer as well.
In system 4, all material movement is possible. However, transshipments between any two
remote retailers are twice as expensive as the transshipments from/to the central retailer. Finally,
all transshipment costs are identical in system 5.
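
The five configurations can be encoded as transshipment cost matrices, with an infinite entry marking a link that is not allowed. The sketch below is illustrative: the function name `transshipment_costs`, the zero diagonal, and the use of `float("inf")` are assumptions of this example, while the cost pattern follows the verbal description of systems 1 through 5 with a basic cost of $0.5.

```python
INF = float("inf")


def transshipment_costs(system, n_remote, c_t=0.5):
    """Build the (n_remote + 1) x (n_remote + 1) cost matrix for systems 1-5.

    Index 0 is the central retailer; INF means the link is not allowed.
    """
    size = n_remote + 1
    c = [[INF] * size for _ in range(size)]
    for i in range(size):
        c[i][i] = 0.0  # convention of this sketch: no cost to serve own demand
    for i in range(1, size):
        if system >= 2:
            c[0][i] = c_t          # central -> remote
        if system >= 3:
            c[i][0] = c_t          # remote -> central
        for j in range(1, size):
            if i == j:
                continue
            if system == 4:
                c[i][j] = 2 * c_t  # remote -> remote at twice the basic cost
            elif system == 5:
                c[i][j] = c_t      # all links at the basic cost
    return c


system_4 = transshipment_costs(4, n_remote=3)
```

Such a matrix could be passed directly to whatever routine solves problem (P), since forbidden links are simply arcs with infinite cost.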


4.3 Results and Analysis for the Base Case 


The first set of experiments consists of configurations where the N remote retailers have identical
cost parameters, and all retailers have independent and identically distributed demand. The
order-up-to levels computed by our algorithm for the ten-retailer configuration are listed in Table
5 and are depicted in Figure 5. Since locations with identical characteristics converge to the
same value, we present the base stock level of the central retailer and the average base stock
level of the remote retailers. The average total cost for the optimal configuration is also shown
in Table 5.
Figure 4: Configuration with four retailers


Table 4: System configurations 


System    c_0i    c_i0    c_ij
1         ∞       ∞       ∞
2         c_t     ∞       ∞
3         c_t     c_t     ∞
4         c_t     c_t     2c_t
5         c_t     c_t     c_t


The results of this set of experiments confirm the intuition about the behavior of the systems,
as follows: in system 2, the central retailer carries considerably more inventory than the other
retailers, since this stock can be transshipped to other retailers to meet the demand they face.
Given the possibility of transshipments to/from the central retailer in system 3, we observe a
reduction in inventory in the central retailer together with an increase in inventory at the other
retailers. In system 4, this phenomenon is further accentuated. System 5, where transshipments
are allowed among all retailers, distributes the inventory evenly throughout the system as in
system 1, but at a lower cost than the newsvendor benchmark of system 1. Comparing system 1,
where we have ten independent newsvendors, with system 5, where transshipments are allowed
among all retailers at the basic cost, we note that system-wide inventory is significantly reduced.
For the ten-retailer configuration, this reduction in inventory leads to a 58% reduction in total
costs, as shown in Figure 6. Note, however, that a large part of this benefit, a 39% reduction in
total cost, is obtained when moving from system 1 to system 2, thus demonstrating that a little bit
of flexibility goes a long way.
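
The quoted percentages follow directly from the average total costs reported in Table 5. As a small worked check (the dictionary and function names are illustrative; the cost figures are the ones listed in Table 5):

```python
# Average total cost per system for the ten-retailer configuration (Table 5).
total_cost = {1: 800.64, 2: 486.95, 3: 410.67, 4: 383.81, 5: 334.66}


def reduction_vs_benchmark(system):
    """Percentage cost reduction of `system` relative to system 1."""
    base = total_cost[1]
    return 100.0 * (base - total_cost[system]) / base


reductions = {s: reduction_vs_benchmark(s) for s in total_cost}
```

Rounded to whole percentages, systems 2 and 5 give the 39% and 58% reductions cited in the text.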

Jordan and Graves (1995) showed qualitatively similar results in the context of process
flexibility, defined as the ability to build different types of products in the same plant at the same
time. In particular, they showed that limited flexibility, configured as a chain that connects
products and plants, yields most of the benefits of total flexibility.


Table 5: Optimal order-up-to levels in a ten-retailer configuration 


System    S_0      Average S_1-S_9    Total Inv.    Cost
1         159.9    160.0              1600          800.64
2         481.7    87.1               1266          486.95
3         319.2    100.3              1222          410.67
4         172.8    113.3              1193          383.81
5         117.1    117.1              1171          334.66
Figure 5: Optimal order-up-to levels in a ten-retailer configuration
Figure 6: Average total cost under different systems for the ten-retailer configuration


Figure 7 depicts the value of transshipments: for each number of retailers considered, we
observe that the expected total system cost decreases significantly as transshipments become
more flexible and less expensive.
Figure 7: Average total cost under different systems for varying number of retailers


4.4 Correlated Demand


To study the impact of correlated demand, we consider a ten-retailer configuration, with the
same cost structure as described in the previous section. We experiment with scenarios of high
(±0.9), medium (±0.5), and low (±0.2) levels of demand correlation. A case with zero correlation
is also added for reference. Unlike the previous section, the demand faced by the retailers is
modeled as a multivariate normal random variable with a mean of 100 and a standard deviation
of 20. The (i,j)th entry of the variance-covariance matrix is given by σ_i σ_j ρ_ij, where ρ_ij denotes
the level of demand correlation being investigated when i ≠ j and one when i = j. Thus, for
example, when we investigate medium negative correlation the diagonal elements of the
variance-covariance matrix are all 400 and the off-diagonal elements are all -200.

Correlated demand can be found in many real situations. For example, positive correlation 
can be caused by some event common to all locations, e.g., rain causes demand for umbrellas to 
increase at all locations. Negative correlation, on the other hand, can be due to the fact that when 
the market is limited, a higher than expected demand at one location is associated with a lower 
than expected demand at another location. 

For system 1, where no transshipments take place, positively or negatively correlated
demand has no impact on base stock levels or total cost, as each retailer solves his own
newsvendor problem, the solution of which is to order up to 116.8 units. When transshipments
are allowed, however, correlated demand does have a sizeable impact. In general, positive
correlation reduces the effectiveness of transshipments while negative correlation enhances it. In
particular, with a high positive correlation, the difference among the five systems under
consideration is relatively small: every system behaves similarly to system 1, in
which transshipments are not allowed, and the objective function values of all systems are
practically indistinguishable. The base stock and the average inventory levels in systems 2
through 5 are shown in Figure 8.
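
The system 1 benchmark of 116.8 units can be reproduced from the standard newsvendor critical ratio p / (h + p) applied to normal demand with mean 100 and standard deviation 20, using h = $1 and p = $4. The helper name `newsvendor_level` below is illustrative.

```python
from statistics import NormalDist


def newsvendor_level(mean, std, h, p):
    """Order-up-to level at the critical ratio p / (h + p) for normal demand."""
    return NormalDist(mean, std).inv_cdf(p / (h + p))


benchmark = newsvendor_level(mean=100.0, std=20.0, h=1.0, p=4.0)
```

Because each retailer in system 1 faces only its own marginal demand distribution, this level does not depend on the correlation between retailers.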
Figure 8: Optimal order-up-to and average inventory levels for ten retailers with correlated
demand


In system 2, positive correlation limits the role of the central retailer as a clearinghouse for
the remote retailers. Any level of negative correlation, on the other hand, reinforces the central
retailer's clearinghouse role. This type of behavior is also observed in systems 3 and 4. The
graph for system 5 is the most drastic illustration of how positive correlation reduces the
effectiveness of transshipments. When demand has a high positive correlation, the average
inventory is 116.1 units, which is very close to the system 1 level of 116.8 units. As the demand
becomes negatively correlated, however, the ability to match demand with supply through
transshipments is further enhanced.
Figure 9: Average total cost for ten retailers with correlated demand


Figure 9 illustrates the impact of demand correlation on the average total cost for a
10-retailer configuration. High levels of positive correlation eliminate the value of
transshipments, making all five systems quite costly to operate. As demand correlation gets
smaller (or negative), the effectiveness of transshipments in matching demand and supply is
enhanced, which is reflected by the significantly lower average total cost of system 5.


4.5 Non-Identical Costs 


In all configurations considered thus far, all remote retailers have had identical cost parameters.
These cost parameters differed from the central retailer’s cost parameters with respect to the
transshipment cost, as our solution technique can handle non-identical cost parameters. To
further emphasize this ability, we consider a ten-retailer configuration, where we modify the cost
parameters without violating equation 1, h_i < c_ij + h_j for all i,j. In particular, we set h_0 = $1 as
before, and h_i = h_{i-1} + 0.05, i=1,...,9. Similarly, p_0 = $4 as before, and p_j = p_{j-1} + 0.20, j=1,...,9.
For system 1, where no transshipments are allowed, c_ij = +∞. For system 2, c_01 = $0.5 and c_0j =
c_{0,j-1} + 0.1, for j=2,...,9, and c_ij = +∞ otherwise. For system 3, c_j0 = c_0j of system 2, and c_ij = +∞
otherwise. For system 4, c_j0 and c_0j as in system 3 and c_12 = c_21 = $1.2, c_ij = c_{i-1,j-1} + 0.2,
i,j=2,...,9, i≠j. Finally, for system 5, c_ij = $0.5 for all i,j.
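
The holding and shortage cost sequences, and the check against equation 1, can be written out explicitly. The sketch below assumes that equation 1 reads h_i < c_ij + h_j (our reading of the condition stated above) and checks it for system 5, where every transshipment link costs $0.5. The variable and function names are illustrative.

```python
n = 10  # retailers 0..9, with retailer 0 the central retailer

# h_0 = 1 and h_i = h_{i-1} + 0.05; p_0 = 4 and p_j = p_{j-1} + 0.20.
h = [1.0 + 0.05 * i for i in range(n)]
p = [4.0 + 0.20 * j for j in range(n)]

# System 5: every transshipment link costs $0.5.
c = [[0.0 if i == j else 0.5 for j in range(n)] for i in range(n)]


def satisfies_equation_1(h, c):
    """Check h_i < c_ij + h_j for every ordered pair of distinct retailers."""
    size = len(h)
    return all(h[i] < c[i][j] + h[j]
               for i in range(size) for j in range(size) if i != j)


ok = satisfies_equation_1(h, c)
```

The largest holding cost gap is h_9 - h_0 = 0.45, which stays below the transshipment cost of 0.5, so the condition holds for every pair in this configuration.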

We observe that transshipments maintain their positive impact on the overall performance, as
systems 1,...,5 lead gradually to lower stock levels (Figure 10) and lower total cost (Figure 11).
The magnitude of this improvement, however, is heavily dependent on the relative cost
parameters.
Figure 11: Average total cost for ten retailers with non-identical costs


5. Summary


In this paper, we considered the multi-location dynamic transshipment problem. First, an
arbitrary number of non-identical retailers is considered with possibly dependent stochastic
demand. Second, we model the dynamic behavior of the system in an arbitrary period as a
network flow problem. Finally, we employ a simulation-based method using infinitesimal
perturbation analysis for optimization. Our simulation-based optimization approach therefore
provides a flexible platform to analyze transshipment problems of arbitrary complexity. An
interesting generalization to the problem addressed in this paper is the case of positive
replenishment lead-times. In this case, it is not immediately clear how to find the optimal
transshipment policy, as it may be beneficial for a retailer to hold back some of its own inventory
rather than transship it. As a result, it is also not clear whether an order-up-to policy remains
optimal. These will be interesting issues for future research.